pacman::p_load(tmap, sf, DT, stplanr,
performance, ggpubr, tidyverse)Hands-on Exercise 3: Spatial Interaction Models
1. Overview
Spatial interaction represent the flow of people, material, or information between locations in geographical space. It encompasses everything from freight shipments, energy flows, and the global trade in rare antiquities, to flight schedules, rush hour woes, and pedestrian foot traffic.
Each spatial interaction, as an analogy for a set of movements, is composed of a discrete origin/destination pair. Each pair can be represented as a cell in a matrix where rows are related to the locations of origin and columns are related to locations of destination. Such a matrix is commonly known as an origin/destination matrix or a spatial interaction matrix.
In this hands-on exercise, we will build an OD matrix by using Passenger Volume by Origin Destination Bus Stops data set downloaded from LTA DataMall.
2. Getting Started
We will use the following R packages in this exercise:
sffor importing, integrating, processing and transforming geospatial datatidyversefor importing, integrating, wrangling and visualising datatmapfor creating thematic maps
3. Preparing the Flow Data
3.1. Importing Origin Destination data
First, we will import the Passenger Volume by Origin Destination Bus Stops data set download from LTA DataMall:
odbus <- read_csv("data/aspatial/origin_destination_bus_202310.csv")
glimpse(odbus)Rows: 5,694,297
Columns: 7
$ YEAR_MONTH <chr> "2023-10", "2023-10", "2023-10", "2023-10", "2023-…
$ DAY_TYPE <chr> "WEEKENDS/HOLIDAY", "WEEKDAY", "WEEKENDS/HOLIDAY",…
$ TIME_PER_HOUR <dbl> 16, 16, 14, 14, 17, 17, 17, 7, 14, 14, 10, 20, 20,…
$ PT_TYPE <chr> "BUS", "BUS", "BUS", "BUS", "BUS", "BUS", "BUS", "…
$ ORIGIN_PT_CODE <chr> "04168", "04168", "80119", "80119", "44069", "2028…
$ DESTINATION_PT_CODE <chr> "10051", "10051", "90079", "90079", "17229", "2014…
$ TOTAL_TRIPS <dbl> 3, 5, 3, 5, 4, 1, 24, 2, 1, 7, 3, 2, 5, 1, 1, 1, 1…
It is observed that ORIGIN_PT_CODE and DESTINATION_PT_CODE are both recognised as integers. Since these are bus stop identifiers, they should be converted to factor type using the following code:
odbus$ORIGIN_PT_CODE <-
as.factor(odbus$ORIGIN_PT_CODE)
odbus$DESTINATION_PT_CODE <-
as.factor(odbus$DESTINATION_PT_CODE)
glimpse(odbus)Rows: 5,694,297
Columns: 7
$ YEAR_MONTH <chr> "2023-10", "2023-10", "2023-10", "2023-10", "2023-…
$ DAY_TYPE <chr> "WEEKENDS/HOLIDAY", "WEEKDAY", "WEEKENDS/HOLIDAY",…
$ TIME_PER_HOUR <dbl> 16, 16, 14, 14, 17, 17, 17, 7, 14, 14, 10, 20, 20,…
$ PT_TYPE <chr> "BUS", "BUS", "BUS", "BUS", "BUS", "BUS", "BUS", "…
$ ORIGIN_PT_CODE <fct> 04168, 04168, 80119, 80119, 44069, 20281, 20281, 1…
$ DESTINATION_PT_CODE <fct> 10051, 10051, 90079, 90079, 17229, 20141, 20141, 1…
$ TOTAL_TRIPS <dbl> 3, 5, 3, 5, 4, 1, 24, 2, 1, 7, 3, 2, 5, 1, 1, 1, 1…
3.2. Extracting Study Data
We will extract commuting flows on weekdays between 6am and 9am.
odbus6_9 <- odbus %>%
filter(DAY_TYPE == 'WEEKDAY') %>%
filter(TIME_PER_HOUR >= 6 &
TIME_PER_HOUR <= 9) %>%
group_by(ORIGIN_PT_CODE, DESTINATION_PT_CODE) %>%
summarise(TRIPS = sum(TOTAL_TRIPS))
datatable(odbus6_9)Save the output in rds format for future use:
write_rds(odbus6_9, "data/rds/odbus6_9.rds")3.3. Importing Geospatial Data
The following geospatial data will be used in this exercise:
BusStop: Provides location of bus stops
MPSZ-2019: Provides sub-zone bondaries based on the URA Master Plan 2019
We will use the following codes to import the two data sets into R:
busstop <- st_read(dsn = "data/geospatial",
layer = "BusStop") %>%
st_transform(crs = 3414)Reading layer `BusStop' from data source
`C:\Users\noel1\Documents\School\02. Special Sem 1\ISSS624 Geospatial Analysis\noelnomel\ISSS624\Hands_on_Ex\Hands_on_Ex03\data\geospatial'
using driver `ESRI Shapefile'
Simple feature collection with 5161 features and 3 fields
Geometry type: POINT
Dimension: XY
Bounding box: xmin: 3970.122 ymin: 26482.1 xmax: 48284.56 ymax: 52983.82
Projected CRS: SVY21
busstopSimple feature collection with 5161 features and 3 fields
Geometry type: POINT
Dimension: XY
Bounding box: xmin: 3970.122 ymin: 26482.1 xmax: 48284.56 ymax: 52983.82
Projected CRS: SVY21 / Singapore TM
First 10 features:
BUS_STOP_N BUS_ROOF_N LOC_DESC geometry
1 22069 B06 OPP CEVA LOGISTICS POINT (13576.31 32883.65)
2 32071 B23 AFT TRACK 13 POINT (13228.59 44206.38)
3 44331 B01 BLK 239 POINT (21045.1 40242.08)
4 96081 B05 GRACE INDEPENDENT CH POINT (41603.76 35413.11)
5 11561 B05 BLK 166 POINT (24568.74 30391.85)
6 66191 B03 AFT CORFE PL POINT (30951.58 38079.61)
7 23389 B02A PEC LTD POINT (12476.9 32211.6)
8 54411 B02 BLK 527 POINT (30329.45 39373.92)
9 28531 B09 BLK 536 POINT (14993.31 36905.61)
10 96139 B01 BLK 148 POINT (41642.81 36513.9)
mpsz2019 <- st_read(dsn = "data/geospatial",
layer = "MPSZ-2019") %>%
st_transform(crs = 3414)Reading layer `MPSZ-2019' from data source
`C:\Users\noel1\Documents\School\02. Special Sem 1\ISSS624 Geospatial Analysis\noelnomel\ISSS624\Hands_on_Ex\Hands_on_Ex03\data\geospatial'
using driver `ESRI Shapefile'
Simple feature collection with 332 features and 6 fields
Geometry type: MULTIPOLYGON
Dimension: XY
Bounding box: xmin: 103.6057 ymin: 1.158699 xmax: 104.0885 ymax: 1.470775
Geodetic CRS: WGS 84
mpsz2019Simple feature collection with 332 features and 6 fields
Geometry type: MULTIPOLYGON
Dimension: XY
Bounding box: xmin: 2667.538 ymin: 15748.72 xmax: 56396.44 ymax: 50256.33
Projected CRS: SVY21 / Singapore TM
First 10 features:
SUBZONE_N SUBZONE_C PLN_AREA_N PLN_AREA_C REGION_N
1 MARINA EAST MESZ01 MARINA EAST ME CENTRAL REGION
2 INSTITUTION HILL RVSZ05 RIVER VALLEY RV CENTRAL REGION
3 ROBERTSON QUAY SRSZ01 SINGAPORE RIVER SR CENTRAL REGION
4 JURONG ISLAND AND BUKOM WISZ01 WESTERN ISLANDS WI WEST REGION
5 FORT CANNING MUSZ02 MUSEUM MU CENTRAL REGION
6 MARINA EAST (MP) MPSZ05 MARINE PARADE MP CENTRAL REGION
7 SUDONG WISZ03 WESTERN ISLANDS WI WEST REGION
8 SEMAKAU WISZ02 WESTERN ISLANDS WI WEST REGION
9 SOUTHERN GROUP SISZ02 SOUTHERN ISLANDS SI CENTRAL REGION
10 SENTOSA SISZ01 SOUTHERN ISLANDS SI CENTRAL REGION
REGION_C geometry
1 CR MULTIPOLYGON (((33222.98 29...
2 CR MULTIPOLYGON (((28481.45 30...
3 CR MULTIPOLYGON (((28087.34 30...
4 WR MULTIPOLYGON (((14557.7 304...
5 CR MULTIPOLYGON (((29542.53 31...
6 CR MULTIPOLYGON (((35279.55 30...
7 WR MULTIPOLYGON (((15772.59 21...
8 WR MULTIPOLYGON (((19843.41 21...
9 CR MULTIPOLYGON (((30870.53 22...
10 CR MULTIPOLYGON (((26879.04 26...
Save the output in rds format for future use:
write_rds(mpsz2019, "data/rds/mpsz2019.rds")4. Geospatial Data Wrangling
4.1. Combining busstop and mpsz2019
We will use the following code to populate the planning subzone code from mpsz2019 sf dataframe to busstop sf dataframe:
bs_mpsz2019 <- st_intersection(busstop, mpsz2019) %>%
select(BUS_STOP_N, SUBZONE_C) %>%
st_drop_geometry()
datatable(bs_mpsz2019)st_intersection() is used to perform point and polygon overlay, output is in point sf object. Five bus stops are dropped from the results as they are outside Singapore’s boundaries.
Save the output in rds format for future use:
write_rds(bs_mpsz2019, "data/rds/bs_mpsz2019.rds")Next, we will append the planning subzone code from bs_mpsz2019 onto the odbus6_9 dataframe:
od_data <- left_join(odbus6_9, bs_mpsz2019,
by = c("ORIGIN_PT_CODE" = "BUS_STOP_N")) %>%
rename(ORIGIN_BS = ORIGIN_PT_CODE,
ORIGIN_SZ = SUBZONE_C,
DESTIN_BS = DESTINATION_PT_CODE)Check for duplicates using the following code:
duplicates <- od_data %>%
group_by_all() %>%
filter(n() > 1) %>%
ungroup
glimpse(duplicates)Rows: 1,186
Columns: 4
$ ORIGIN_BS <chr> "11009", "11009", "11009", "11009", "11009", "11009", "11009…
$ DESTIN_BS <fct> 01341, 01341, 01411, 01411, 01421, 01421, 01511, 01511, 0152…
$ TRIPS <dbl> 1, 1, 4, 4, 17, 17, 19, 19, 2, 2, 1, 1, 11, 11, 24, 24, 1, 1…
$ ORIGIN_SZ <chr> "QTSZ01", "QTSZ01", "QTSZ01", "QTSZ01", "QTSZ01", "QTSZ01", …
Duplicate records are found in the dataframe. We will use the following code to retain the unique records:
od_data <- unique(od_data)Next, we update the od_data dataframe with the planning subzone codes:
od_data <- left_join(od_data, bs_mpsz2019,
by = c("DESTIN_BS" = "BUS_STOP_N"))duplicates <- od_data %>%
group_by_all() %>%
filter(n() > 1) %>%
ungroup
glimpse(duplicates)Rows: 1,350
Columns: 5
$ ORIGIN_BS <chr> "01013", "01013", "01112", "01112", "01112", "01112", "01121…
$ DESTIN_BS <chr> "51071", "51071", "51071", "51071", "53041", "53041", "51071…
$ TRIPS <dbl> 2, 2, 66, 66, 4, 4, 8, 8, 1, 1, 1, 1, 24, 24, 6, 6, 4, 4, 21…
$ ORIGIN_SZ <chr> "RCSZ10", "RCSZ10", "RCSZ10", "RCSZ10", "RCSZ10", "RCSZ10", …
$ SUBZONE_C <chr> "CCSZ01", "CCSZ01", "CCSZ01", "CCSZ01", "BSSZ01", "BSSZ01", …
od_data <- unique(od_data)od_data <- od_data %>%
rename(DESTIN_SZ = SUBZONE_C) %>%
drop_na() %>%
group_by(ORIGIN_SZ, DESTIN_SZ) %>%
summarise(AM_PEAK = sum(TRIPS))
od_data# A tibble: 21,079 × 3
# Groups: ORIGIN_SZ [310]
ORIGIN_SZ DESTIN_SZ AM_PEAK
<chr> <chr> <dbl>
1 AMSZ01 AMSZ01 2694
2 AMSZ01 AMSZ02 10591
3 AMSZ01 AMSZ03 14980
4 AMSZ01 AMSZ04 3106
5 AMSZ01 AMSZ05 7734
6 AMSZ01 AMSZ06 2306
7 AMSZ01 AMSZ07 1824
8 AMSZ01 AMSZ08 2734
9 AMSZ01 AMSZ09 2300
10 AMSZ01 AMSZ10 164
# ℹ 21,069 more rows
Save the output in rds format for future use:
write_rds(od_data, "data/rds/od_data.rds")5. Visualising Spatial Interaction
We will prepare a desire line using the stplanr package.
5.1. Removing Intra-zonal Flows
We will use the following code to remove intra-zonal flows:
od_data1 <- od_data[od_data$ORIGIN_SZ != od_data$DESTIN_SZ,]5.2. Creating Desire Lines
We will use od2line() from the stplanr package to create desire lines:
flow_line <- od2line(flow = od_data1,
zones = mpsz2019,
zone_code = "SUBZONE_C")5.3. Visualising Desire Lines
We use the following code to visualise the desire lines:
tm_shape(mpsz2019) +
tm_polygons() +
flow_line%>%
tm_shape() +
tm_lines(lwd = "AM_PEAK",
style = "quantile",
scale = c(0.1,1,3,5,7,10),
n = 6,
alpha = 0.3)
When the data flow is very messy and highly skewed, focusing on selected flows may give more clarity. For example, we can focus on flow greater than or equal to 5000:
tm_shape(mpsz2019) +
tm_polygons() +
flow_line %>%
filter(AM_PEAK >= 5000) %>%
tm_shape() +
tm_lines(lwd = "AM_PEAK",
style = "quantile",
scale = c(0.1,1,3,5,7,10),
n = 6,
alpha = 0.3)